This series of files compile analyses done for the specific analysis of Chapter 1, for the regional campaign of 2017.

All analyses have been done with PRIMER-e 6 and R 4.0.1.

Click on the table of contents in the left margin to assess a specific analysis.
Click on a figure to zoom it


We used data from subtidal ecosystems (see metadata files for more information). Only stations that have been sampled both for abiotic parameters and benthic species were included.

Selected variables for the analyses:

We only considered 500 µm communities for these analyses. Abundances of Bipalponephtys neotena (Bneo) and Nematoda (Nema) were also considered (see IndVal and SIMPER results).

Heavy metal concentrations for campaign 2017 have been kriged from the values collected at campaigns 2014 and 2016. As data is missing for metal concentrations outside BSI, two Designs have been used:


1. Data manipulation

For the following analyses, independant variables are habitat parameters and heavy metal concentrations, dependant variables are diversity indices. Variables have been standardized by mean and standard-deviation.

1.1. Identification of outliers

To identify stations that are not consistent with the others, we used the multivariate Cook’s Distance (CD) on the uncorrelated variables. A significative threshold of 4 times the mean of CD has been established.

Design 1

We identified stations 188, 194 and 228 as general outliers. They have been deleted for the following analyses of Design 1.

Design 2

We identified stations 132 and 154 as general outliers. They have been deleted for the following analyses of Design 2.

1.2. Correlations between parameters

Correlations have been calculated with Spearman’s rank coefficient.

Design 1

According to these results, the following variables are highly correlated (\(|\rho|\) > 0.80) so they have been considered together in the regressions of Design 1:

  • sand and silt (silt deleted)
Correlation coefficients between habitat parameters (Design 1)
  om gravel sand silt clay
om 1 -0.365 -0.791 0.877 0.119
gravel -0.365 1 0.015 -0.325 0.183
sand -0.791 0.015 1 -0.896 -0.47
silt 0.877 -0.325 -0.896 1 0.212
clay 0.119 0.183 -0.47 0.212 1

Design 2

Many variables are highly correlated (\(|\rho|\) > 0.80), but we have considered the following together in the regressions of Design 2:

  • chromium, iron and manganese (iron and manganese deleted)
  • copper, lead and zinc (copper and zinc deleted)

We decided to keep arsenic, even though it is correlated with the copper/lead/zinc group, to stay consistant with the 2014 and 2016 campaigns.

Correlation coefficients between heavy metals concentrations (Design 2)
  arsenic cadmium chromium copper iron manganese mercury lead zinc
arsenic 1 0.658 0.838 0.801 0.701 0.598 0.438 0.8 0.874
cadmium 0.658 1 0.809 0.467 0.719 0.794 0.288 0.806 0.777
chromium 0.838 0.809 1 0.811 0.871 0.84 0.406 0.889 0.945
copper 0.801 0.467 0.811 1 0.688 0.573 0.308 0.805 0.896
iron 0.701 0.719 0.871 0.688 1 0.834 0.182 0.71 0.848
manganese 0.598 0.794 0.84 0.573 0.834 1 0.208 0.679 0.77
mercury 0.438 0.288 0.406 0.308 0.182 0.208 1 0.488 0.37
lead 0.8 0.806 0.889 0.805 0.71 0.679 0.488 1 0.924
zinc 0.874 0.777 0.945 0.896 0.848 0.77 0.37 0.924 1

2. Permutational Analyses of Covariance

Results of univariate PermANCOVAs on parameters and multivariate PermANCOVA on the whole benthic community with depth as covariate are presented in the table below. Variables have been standardized by mean and standard-deviation, and taxon densities were (log+1) transformed.

Variable Condition Depth
om S
gravel
sand
silt S
clay
S (500 µm) S
N (500 µm)
H (500 µm) S
J (500 µm) S
ALL SPECIES (500 µm) S S

3. Similarity and characteristic species

Let’s have a look at the \(\beta\) diversity within our conditions and sites.

Results of the PERMDISP routine are shown below (mean and SE of the deviation from centroid for each group, i.e. multivariate dispersion), along with the mean Bray-Curtis dissimilarity for each group. Taxon densities were (log+1) transformed and PRIMER was used to do the PERMDISP.

Mean within-group Bray-Curtis dissimilarity for each condition or site
  Mean deviation SE of deviation Mean BC dissimilarity
HI 57.5 1.64 0.829
R 48.8 2.43 0.71

Significative differences in dispersion have been detected between HI and R (p = 0.0053).

The following analyses allowed to detect species as characteristic of each condition. We used results from PRIMER to justify further their choice.

##                        cluster indicator_value probability
## bipalponephtys_neotena       1          0.6390       0.001
## macoma_calcarea              1          0.5583       0.004
## goniada_maculata             1          0.2917       0.028
## nematoda                     2          0.6565       0.005
## echinarachnius_parma         2          0.6524       0.001
## crenella_decussata           2          0.5836       0.001
## ecrobia_truncata             2          0.3333       0.005
## mesodesma_arctatum           2          0.2667       0.016
## solariella_sp                2          0.2000       0.039
## 
## Sum of probabilities                 =  55.401 
## 
## Sum of Indicator Values              =  13.26 
## 
## Sum of Significant Indicator Values  =  4.18 
## 
## Number of Significant Indicators     =  9 
## 
## Significant Indicator Distribution
## 
## 1 2 
## 3 6
SIMPER results (mean Bray-Curtis between-group dissimilarity: 0.872)
  average sd ratio ava avb cumsum
nematoda 0.0743 0.0621 1.2 0.879 2.08 0.0852
echinarachnius_parma 0.0618 0.0632 0.977 0.361 1.59 0.156
bipalponephtys_neotena 0.0556 0.0485 1.15 1.78 0.193 0.22
eudorellopsis_integra 0.0358 0.0494 0.724 1.08 0.119 0.261
crenella_decussata 0.0346 0.0352 0.983 0.0289 1.03 0.3
mesodesma_arctatum 0.0338 0.064 0.528 0 0.751 0.339
macoma_calcarea 0.0316 0.0323 0.977 1.03 0.0462 0.375
harpacticoida 0.0283 0.0325 0.872 0.801 0.258 0.408
phoxocephalus_holbolli 0.0282 0.0329 0.857 0.411 0.658 0.44
amphipoda 0.0234 0.0274 0.852 0.483 0.346 0.467
pholoe_sp 0.0215 0.0249 0.866 0.413 0.434 0.492
ameritella_agilis 0.018 0.0252 0.715 0.132 0.451 0.512
ecrobia_truncata 0.0168 0.0321 0.525 0 0.575 0.532
ennucula_tenuis 0.0164 0.022 0.742 0.401 0.212 0.55
ischyrocerus_anguipes 0.0155 0.0327 0.475 0.406 0.13 0.568
akanthophoreus_gracilis 0.0154 0.0263 0.584 0.389 0.193 0.586
hiatella_arctica 0.015 0.0309 0.483 0.0747 0.368 0.603
ostracoda 0.013 0.0195 0.664 0.207 0.258 0.618
cistenides_granulata 0.0129 0.0202 0.641 0.233 0.212 0.633
axinopsida_orbiculata 0.0117 0.025 0.47 0.361 0 0.646
thracia_septentrionalis 0.0117 0.0185 0.633 0.183 0.266 0.66
mysella_planulata 0.0117 0.0263 0.445 0 0.333 0.673
sabellidae_spp 0.0112 0.0323 0.348 0.361 0.0462 0.686
leucon_leucon_nasicoides 0.0107 0.0237 0.454 0.294 0 0.698
nephtyidae_spp 0.0107 0.0176 0.608 0.274 0.0462 0.71
orchomenella_minuta 0.0102 0.019 0.539 0.0578 0.212 0.722
nephtys_caeca 0.00966 0.0194 0.497 0.116 0.119 0.733
goniada_maculata 0.00965 0.0165 0.585 0.286 0 0.744
mytilus_sp 0.0092 0.0268 0.342 0.287 0 0.755
solariella_sp 0.0089 0.0205 0.435 0 0.29 0.765
pontoporeia_femorata 0.00864 0.0245 0.352 0.225 0 0.775
parvicardium_pinnulatum 0.00863 0.0164 0.527 0.0578 0.227 0.785
caprella_septentrionalis 0.00844 0.029 0.291 0.34 0 0.795
protomedeia_fasciata 0.00786 0.0165 0.478 0.241 0 0.804
lamprops_fuscatus 0.00694 0.0128 0.543 0.207 0.0462 0.811
pholoe_longa 0.00605 0.0189 0.319 0.11 0.0732 0.818
protomedeia_grandimana 0.00592 0.0173 0.342 0.213 0 0.825
astarte_sp 0.00588 0.013 0.454 0.0747 0.119 0.832
chone_sp 0.00567 0.0282 0.201 0.107 0 0.838
monoculopsis_longicornis 0.00559 0.0144 0.388 0.125 0.0924 0.845
ophelia_limacina 0.00551 0.0112 0.494 0.0289 0.166 0.851
glycera_sp 0.00542 0.0171 0.317 0.116 0 0.857
chaetodermatida 0.00541 0.0117 0.462 0.144 0.0462 0.864
nephtys_incisa 0.00535 0.0126 0.423 0.104 0.0462 0.87
maldanidae_spp 0.00472 0.0156 0.303 0.113 0.0462 0.875
aceroides_aceroides_latipes 0.00452 0.0111 0.406 0.144 0 0.88
quasimelita_formosa 0.00447 0.0122 0.366 0.125 0 0.885
sipuncula 0.00428 0.0122 0.35 0 0.0924 0.89
euchone_sp 0.00426 0.0207 0.206 0.146 0 0.895
anthozoa 0.00409 0.0111 0.369 0 0.119 0.9

4. Univariate regressions

We used linear models for the all regressions on diversity indices. Outliers and correlated variables were removed from these analyses. Variables have been standardized by mean and standard-deviation (coefficients need to be back-transformed to be used in predictive models).

4.1. Simple regressions

These analyses have been do to explore the relationships between variables. As it is a huge number of results to interpret, only multiple regressions will be included in the article (see below).

Depth has been shown important for several parameters in the ANCOVAs, so here are the corresponding scatterplots.

Design 1
Adjusted R-squared of simple regressions for Design 1
  om gravel sand clay
S_500 -0.02767 0.01197 -0.001571 -0.02831
N_500 -0.0185 -0.02098 -0.02294 0.002618
H_500 -0.009374 -0.0139 0.02245 0.02041
J_500 0.02241 -0.02941 0.005738 0.02843
p-values of simple regressions for Design 1
  om gravel sand clay
S_500 0.8117 0.241 0.3378 0.8497
N_500 0.5501 0.5997 0.6458 0.3034
H_500 0.417 0.4757 0.1882 0.1973
J_500 0.1883 0.9884 0.2806 0.1639
Design 2

Quitting from lines 266-280 (C1_analyses_17B.Rmd) Error in pandoc.table.return(…) : Wrong number of parameters (7 instead of 6) passed: justify De plus : Warning messages: 1: attribute variables are assumed to be spatially constant throughout all geometries 2: attribute variables are assumed to be spatially constant throughout all geometries 3: attribute variables are assumed to be spatially constant throughout all geometries 4: attribute variables are assumed to be spatially constant throughout all geometries 5: attribute variables are assumed to be spatially constant throughout all geometries 6: attribute variables are assumed to be spatially constant throughout all geometries 7: attribute variables are assumed to be spatially constant throughout all geometries 8: attribute variables are assumed to be spatially constant throughout all geometries Quitting from lines 266-280 (C1_analyses_17B.Rmd) Error in pandoc.table.return(…) : Wrong number of parameters (7 instead of 6) passed: justify

4.2. Multiple regressions

This section presents analyses done to determine (i) which model (Design 1, Design 2) decribes the best the parameters and (ii) which variables are the most important to explain the parameters.

4.2.1. Best model selection

This step was not used here as each models are necessary.

4.2.2. Significative variables selection

We identified which variables were selected after an AIC procedure to predict the best the parameters. Results of the variable selection, according to AIC, are shown on the tables below:

  • for the model of Design 1
Variable (or combination) S_500 N_500 H_500 J_500
om +
gravel +
sand/silt +
clay +
Adjusted \(R^{2}\) 0 0 0 0.06
  • for the model of Design 2
Variable (or combination) S_500 N_500 H_500 J_500
arsenic - - -
cadmium +
chromium/iron/manganese - - -
mercury - -
lead/copper/zinc + + +
Adjusted \(R^{2}\) 0.24 0.26 0.24 0.09

Details of the regressions, with diagnostics and cross-validation, are summarized below.

Design 1
Richness
## FULL MODEL
## Adjusted R2 is: -0.06
Fitting linear model: S_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.09546 0.1757 0.5434 0.5907
om -0.2095 0.5347 -0.3918 0.6979
gravel 0.07971 0.2617 0.3046 0.7627
sand -0.3523 0.5343 -0.6593 0.5146
clay -0.1332 0.2967 -0.4489 0.6566
## RMSE from cross-validation: 10.96324
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: S_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.07403 0.1675 0.4419 0.6613
## RMSE from cross-validation: 0.9974367

Quitting from lines 329-331 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = lm_out\)model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = lm_out$model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Density
## FULL MODEL
## Adjusted R2 is: -0.06
Fitting linear model: N_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.03898 0.1774 0.2197 0.8275
om -0.4389 0.54 -0.8128 0.4225
gravel -0.07614 0.2642 -0.2882 0.7751
sand -0.3968 0.5396 -0.7354 0.4676
clay -0.34 0.2996 -1.135 0.2652
## RMSE from cross-validation: 17.81878
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: N_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0233 0.1689 0.1379 0.8911
## RMSE from cross-validation: 1.082179

Quitting from lines 341-343 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = lm_out\)model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = lm_out$model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Diversity
## FULL MODEL
## Adjusted R2 is: -0.03
Fitting linear model: H_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.05635 0.1667 0.3381 0.7376
om 0.4183 0.5073 0.8246 0.4159
gravel 0.2463 0.2483 0.9921 0.3288
sand 0.2698 0.507 0.5323 0.5983
clay 0.3124 0.2815 1.11 0.2756
## RMSE from cross-validation: 27.97158
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0
Fitting linear model: H_500 ~ 1
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.06334 0.1616 0.392 0.6975
## RMSE from cross-validation: 0.9736631

Quitting from lines 353-355 (C1_analyses_17B.Rmd) Error in Qr\(qr[p1, p1, drop = FALSE] : indice hors limites De plus : Warning messages: 1: In CVlm(data = lm_out\)model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

2: In CVlm(data = lm_out$model, form.lm = lm_out, m = 5, printit = F) :

As there is >1 explanatory variable, cross-validation predicted values for a fold are not a linear function of corresponding overall predicted values. Lines that are shown for the different folds are approximate

Evenness
## FULL MODEL
## Adjusted R2 is: 0.06
Fitting linear model: J_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.006409 0.1518 0.04221 0.9666
om 0.9141 0.4621 1.978 0.05686
gravel 0.3201 0.2262 1.415 0.1669
sand 0.7476 0.4618 1.619 0.1156
clay 0.5227 0.2564 2.038 0.05012
## RMSE from cross-validation: 26.77056
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

## REDUCED MODEL
## Adjusted R2 is: 0.06
Fitting linear model: J_500 ~ om + gravel + sand + clay
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.006409 0.1518 0.04221 0.9666
om 0.9141 0.4621 1.978 0.05686
gravel 0.3201 0.2262 1.415 0.1669
sand 0.7476 0.4618 1.619 0.1156
clay 0.5227 0.2564 2.038 0.05012
## RMSE from cross-validation: 26.77056
Variance Inflation Factors
  om gravel sand clay
VIF 2.3 1.55 2.93 1.76

Design 2
Richness
## FULL MODEL
## Adjusted R2 is: 0.21
Fitting linear model: S_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2519 0.2374 -1.061 0.3045
arsenic -1.006 0.7414 -1.356 0.1938
cadmium -0.1794 0.3118 -0.5752 0.5731
chromium -0.6905 0.6333 -1.09 0.2917
mercury -0.6837 0.489 -1.398 0.1812
lead 1.029 0.7637 1.347 0.1966
## RMSE from cross-validation: 1.145083
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.24
Fitting linear model: S_500 ~ arsenic + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.247 0.2326 -1.062 0.3031
arsenic -1.014 0.7265 -1.396 0.1807
chromium -0.8021 0.5908 -1.358 0.1923
mercury -0.6267 0.4693 -1.335 0.1994
lead 0.985 0.7447 1.323 0.2035
## RMSE from cross-validation: 1.012635
Variance Inflation Factors
  arsenic chromium mercury lead
VIF 2.11 2.99 1.2 3.67

Density
## FULL MODEL
## Adjusted R2 is: 0.18
Fitting linear model: N_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2115 0.2383 -0.8875 0.3879
arsenic -0.6111 0.7441 -0.8212 0.4236
cadmium 0.4067 0.3129 1.3 0.2121
chromium -0.07125 0.6356 -0.1121 0.9121
mercury -0.7709 0.4908 -1.571 0.1358
lead -0.1252 0.7665 -0.1633 0.8723
## RMSE from cross-validation: 1.056123
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.26
Fitting linear model: N_500 ~ arsenic + cadmium + mercury
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2439 0.2092 -1.166 0.2589
arsenic -0.8016 0.4748 -1.688 0.1086
cadmium 0.3459 0.2472 1.399 0.1788
mercury -0.8427 0.4205 -2.004 0.06036
## RMSE from cross-validation: 0.9536134
Variance Inflation Factors
  arsenic cadmium mercury
VIF 1.42 1.33 1.11

Diversity
## FULL MODEL
## Adjusted R2 is: 0.24
Fitting linear model: H_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.2432 0.2299 -1.058 0.3059
arsenic -1.098 0.7179 -1.529 0.1458
cadmium -0.2655 0.3019 -0.8792 0.3923
chromium -1.18 0.6132 -1.924 0.07233
mercury -0.604 0.4735 -1.276 0.2203
lead 1.745 0.7395 2.359 0.03135 *
## RMSE from cross-validation: 1.218505
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.24
Fitting linear model: H_500 ~ arsenic + chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.1355 0.2117 -0.6398 0.5304
arsenic -1.021 0.7139 -1.43 0.1698
chromium -1.287 0.5819 -2.212 0.04012 *
lead 1.434 0.7029 2.04 0.05632
## RMSE from cross-validation: 0.9705081
Variance Inflation Factors
  arsenic chromium lead
VIF 2.09 2.98 3.51

Evenness
## FULL MODEL
## Adjusted R2 is: -0.07
Fitting linear model: J_500 ~ arsenic + cadmium + chromium + mercury + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0006482 0.2717 0.002385 0.9981
arsenic -0.2288 0.8484 -0.2697 0.7908
cadmium -0.04213 0.3568 -0.1181 0.9075
chromium -1.262 0.7247 -1.741 0.1008
mercury 0.04122 0.5595 0.07367 0.9422
lead 1.368 0.8739 1.565 0.1371
## RMSE from cross-validation: 1.475412
Variance Inflation Factors
  arsenic cadmium chromium mercury lead
VIF 2.11 1.6 3.14 1.22 3.69

## REDUCED MODEL
## Adjusted R2 is: 0.09
Fitting linear model: J_500 ~ chromium + lead
  Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.01992 0.2107 0.09453 0.9257
chromium -1.294 0.6326 -2.045 0.055
lead 1.259 0.6494 1.938 0.06757
## RMSE from cross-validation: 1.348883
Variance Inflation Factors
  chromium lead
VIF 2.98 2.98

5. Multivariate regressions

Independant variables are habitat parameters or heavy metal concentrations, dependant variables are species abundances. Variables have been standardized by mean and standard-deviation, and outliers and correlated variables have been excluded. Taxon densities were (log+1) transformed.

This analysis has been done on PRIMER, with a DistLM to identify the variables that explain the most the community variability and with a dbRDA to plot the results.

Design 1

Variables selected by the DistLM procedure have a \(R^{2}\) of 0.23.

Design 2

Variables selected by the DistLM procedure have a \(R^{2}\) of 0.13.


| 📄